In these exercises you will use some of the functions from the tidyr package to make datasets tidy.
In several of our exercises (including this one), we will use data on global life expectancy from Gapminder and the Titanic dataset from Kaggle. In addition, for one of the exercises on tidy data, we will use an excerpt from NationMaster data on murders and the intentional homicide rate for 2010.
First, copy the following code into a new R script and run it to load/generate the datasets we will use in these exercises. For the moment, you do not have to understand all of the following code (but you should be able to by the end of this workshop). Please note that for the code to run, your working directory should be the folder with the workshop materials (use getwd() to print your current working directory if you are unsure, and setwd() to change it if necessary).
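# Load the tidyverse packages (this includes tidyr and readr)
library(tidyverse)
# Gapminder data on global life expectancy
gap_life <- read_csv("../data/gapminder/life_expectancy_years.csv")
# Titanic dataset from Kaggle
titanic <- read_csv("../data/titanic/titanic.csv")
# Excerpt from NationMaster crime data for 2010 (the single year value is recycled for all rows)
crime <- tibble(country = rep(c("Germany", "Brazil", "Norway"), 2),
                crime = c(rep("murders", 3), rep("intentional homicide rate", 3)),
                year = 2010,
                value = c(690, 40974, 29, 0.84, 27, 0.68))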
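If the read_csv() calls above fail because a file cannot be found, the working directory is the most likely cause. The following is a minimal sketch of how you might check this, assuming the relative paths used above. The object names data_files and files_found are only illustrative, and the path in the setwd() comment is hypothetical.

```r
# Print the current working directory
getwd()

# Check whether the data files can be found from the current working directory
# (paths copied from the read_csv() calls above)
data_files <- c("../data/gapminder/life_expectancy_years.csv",
                "../data/titanic/titanic.csv")
files_found <- file.exists(data_files)
files_found

# If any entry is FALSE, change the working directory, e.g.:
# setwd("path/to/workshop/materials")  # hypothetical path, adjust to your machine
```

Each element of files_found should be TRUE before you run the code that loads the data.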
You should gather the years into one column/variable. If you are unsure about the arguments of a function, you can always consult the help files by typing (and running) a ? directly followed by the function name (e.g., ?glimpse). NB: This only works if you have previously loaded the package that includes the function.
Use the gather() function.
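To illustrate how gather() works without giving away the solution, here is a small sketch on a made-up dataset. The tibble toy_wide and the column names year and life_exp are hypothetical and not part of the workshop data.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data in wide format: one column per year
toy_wide <- tibble(country = c("A", "B"),
                   `2000` = c(70.1, 65.3),
                   `2001` = c(70.4, 65.9))

# Gather all columns except country into a key column (year)
# and a value column (life_exp)
toy_long <- gather(toy_wide, key = "year", value = "life_exp", -country)
toy_long
```

The result has one row per combination of country and year. Note that the gathered year column is of type character, because it is built from the former column names. In tidyr 1.0.0 and later, pivot_longer() is the newer alternative to gather(), but gather() still works.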
Spread the crime variable using spread().
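As with gather(), a small sketch on made-up data may help to see what spread() does. The tibble toy_measures and its columns are hypothetical and only serve as an illustration.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data: two different measures are stacked in one value column
toy_measures <- tibble(city = rep(c("X", "Y"), 2),
                       measure = c(rep("population", 2), rep("area_km2", 2)),
                       value = c(1000, 2500, 12.5, 30))

# Spread the measure column: each distinct measure becomes its own column
toy_tidy <- spread(toy_measures, key = measure, value = value)
toy_tidy
```

The result has one row per city and a separate column for each measure, which is the tidy layout when the stacked values represent different variables.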